Implementation of a Modern Web Search Engine Cluster

نویسندگان

Maxim Lifantsev

Tzi-cker Chiueh

چکیده

Yuntis is a fully-functional prototype of a complete web search engine with features comparable to those available in commercial-grade search engines. In particular, Yuntis supports page quality scoring based on global web linkage graph, extensively exploits text associated with links, computes pages’ keywords and lists of similar pages of good quality, and provides a very flexible query language. This paper reports our experiences in the three-year development process of Yuntis, by presenting its design issues, software architecture, implementation details, and performance measurements.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A New Hybrid Method for Web Pages Ranking in Search Engines

There are many algorithms for optimizing the search engine results, ranking takes place according to one or more parameters such as; Backward Links, Forward Links, Content, click through rate and etc. The quality and performance of these algorithms depend on the listed parameters. The ranking is one of the most important components of the search engine that represents the degree of the vitality...

متن کامل

Use of Semantic Similarity and Web Usage Mining to Alleviate the Drawbacks of User-Based Collaborative Filtering Recommender Systems

One of the most famous methods for recommendation is user-based Collaborative Filtering (CF). This system compares active user’s items rating with historical rating records of other users to find similar users and recommending items which seems interesting to these similar users and have not been rated by the active user. As a way of computing recommendations, the ultimate goal of the user-ba...

متن کامل

Design and Implementation of Scalable, Fully Distributed Web Crawler for a Web Search Engine

The Web is a context in which traditional Information Retrieval methods are challenged. Given the volume of the Web and its speed of change, the coverage of modern web search engines is relatively small. Search engines attempt to crawl the web exhaustively with crawler for new pages, and to keep track of changes made to pages visited earlier. The centralized design of crawlers introduces limita...

متن کامل

The Implementation of Hadoop-based Crawler System and Graphlite-based PageRank-Calculation In Search Engine

Nowadays, the size of the Internet is experiencing rapid growth. As of December 2014, the number of global Internet websites has more than 1 billion and all kinds of information resources are integrated together on the Internet , however,the search engine is to be a necessary tool for all users to retrieve useful information from vast amounts of web data. Generally speaking, a complete search e...

متن کامل

An Algorithm for Clustering of Web Search Results

In this thesis we propose a description-oriented algorithm for clustering of results obtained from Web search engines called LINGO. The key idea of our method is to first discover meaningful cluster labels and then, based on the labels, determine the actual content of the groups. We show how the cluster label discovery can be accomplished with the use of the Latent Semantic Indexing technique. ...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 2003

Implementation of a Modern Web Search Engine Cluster

نویسندگان

چکیده

منابع مشابه

A New Hybrid Method for Web Pages Ranking in Search Engines

Use of Semantic Similarity and Web Usage Mining to Alleviate the Drawbacks of User-Based Collaborative Filtering Recommender Systems

Design and Implementation of Scalable, Fully Distributed Web Crawler for a Web Search Engine

The Implementation of Hadoop-based Crawler System and Graphlite-based PageRank-Calculation In Search Engine

An Algorithm for Clustering of Web Search Results

عنوان ژورنال:

اشتراک گذاری